NSF PAR Search | NSF Public Access Repository

On Convex Optimization with Semi-Sensitive Features

Ghazi, Badih; Kamath, Pritish; Kumar, Ravi; Manurangsi, Pasin; Meka, Raghu; Zhang, Chiyuan (August 2025, 10.48550/arXiv.2406.19040)

We study the differentially private (DP) empirical risk minimization (ERM) problem under the semi-sensitive DP setting where only some features are sensitive. This generalizes the Label DP setting where only the label is sensitive. We give improved upper and lower bounds on the excess risk for DP-ERM. In particular, we show that the error only scales polylogarithmically in terms of the sensitive domain size, improving upon previous results that scale polynomially in the sensitive domain size.
Free, publicly-accessible full text available August 29, 2026
Empirical Privacy Variance

Hu, Yuzheng; Wu, Fan; Xian, Ruicheng; Liu, Yuhang; Zakynthinou, Lydia; Kamath, Pritish; Zhang, Chiyuan; Forsyth, David (July 2025, openreview.net)

Free, publicly-accessible full text available July 19, 2026
Learning Neural Networks with Sparse Activations

Awasthi, Pranjal; Dikkala, Nishanth; Kamath, Pritish; Meka, Raghu (June 2024, Journal of Machine Learning Research)
Learning Neural Networks with Sparse Activations

Awasthi, Pranjal; Dikkala, Nishanth; Kamath, Pritish; Meka, Raghu (June 2024, Proceedings of the 37th Conference on Learning Theory (COLT 2024))

A core component present in many successful neural network architectures is an MLP block of two fully connected layers with a non-linear activation in between. An intriguing phenomenon observed empirically, including in transformer architectures, is that, after training, the activations in the hidden layer of this MLP block tend to be extremely sparse on any given input. Unlike traditional forms of sparsity, where there are neurons or weights that can be deleted from the network, this form of dynamic activation sparsity appears to be harder to exploit to get more efficient networks. Motivated by this, we initiate a formal study of PAC learnability of MLP layers that exhibit activation sparsity. We present a variety of results showing that such classes of functions do lead to provable computational and statistical advantages over their non-sparse counterparts. Our hope is that a better theoretical understanding of sparsely activated networks would lead to methods that can exploit activation sparsity in practice.
Full Text Available
On Convex Optimization with Semi-Sensitive Features

Ghazi, Badih; Kamath, Pritish; Kumar, Ravi; Manurangsi, Pasin; Meka, Raghu; Zhang, Chiyuan (June 2024, Journal of Machine Learning Research)
User-Level Differential Privacy With Few Examples Per User

Ghazi, Badih; Kamath, Pritish; Kumar, Ravi; Manurangsi, Pasin; Meka, Raghu; Zhang, Chiyuan (December 2023, Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023)
Ticketed Learning–Unlearning Schemes

Ghazi, Badih; Kamath, Pritish; Kumar, Ravi; Manurangsi, Pasin; Sekhari, Ayush; Zhang, Chiyuan (July 2023, Conference on Learning Theory)

Full Text Available
Circuits Resilient to Short-Circuit Errors

https://doi.org/10.1145/3519935.3520007

Efremenko, Klim; Haeupler, Bernhard; Tauman Kalai, Yael; Kamath, Pritish; Kol, Gillat; Resch, Nicolas; Saxena, Raghuvansh (July 2022, Proceedings of the ACM Symposium on Theory of Computing)

Full Text Available
Understanding the Eluder Dimension

Li, Gene; Kamath, Pritish; Foster, Dylan J.; Srebro, Nati (January 2022, Advances in Neural Information Processing Systems)

Full Text Available
Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels

Malach, Eran; Kamath, Pritish; Abbe, Emmanuel; Srebro, Nathan (July 2021, Proceedings of Machine Learning Research)
We study the relative power of learning with gradient descent on differentiable models, such as neural networks, versus using the corresponding tangent kernels. We show that under certain conditions, gradient descent achieves small error only if a related tangent kernel method achieves a non-trivial advantage over random guessing (a.k.a. weak learning), though this advantage might be very small even when gradient descent can achieve arbitrarily high accuracy. Complementing this, we show that without these conditions, gradient descent can in fact learn with small error even when no kernel method, in particular using the tangent kernel, can achieve a non-trivial advantage over random guessing.
Full Text Available
